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I. INTRODUCTION 

Extracting free energy differences from a suitable set 
of computer simulation data is an active field of research 
and of interest for e.g. drug design [l[ or nonperturbative 
quantum chromodynamics 0] ■ Concerning estimators for 
the free energy difference, an extensive literature can be 
found. Probably the most elementary estimator is the 
traditional free energy perturbation 3*, which is briefly 
introduced in the following. Assume we have given two 
systems, arbitrarily labeled as system and system 1, 
that are characterized by Hamiltonians Hq{x) and Hi{x), 
respectively, depending on the point x in phase space. 
Further, let Pi{x) denote the thermal equilibrium phase 
space density of system i, 

P'^{x) = y , ^-0,1, (1) 

where = J e~'^^''-^^c?a; denotes the partition function 
and /3 = the inverse temperature. We are interested 
in the Helmholtz free energy difference AF of the sys- 
tems, defined as AF = —i\n^. Traditional free energy 
perturbation (3| originates from the equality 

Pi{x) 

with AH{x) :— Hi{x) — Hq{x). The latter quantity may 
be interpreted as the work performed during an infinitely 
fast switching process transforming system to system 
1, with initial configuration a; A direct consequence 
of Eq. (12) is the perturbation identity 

e'f'^^ ^ Jc-^'^"^^^p,{x)dx, (3) 

which is frequently used to obtain an estimate of AF 
in drawing a sample {xi, . . . , xjv} from Pq{x) (e.g. by 
Monte Carlo simulations) and evaluating the estimator 

- — -trad 1 ^ , — r 

AFq = --lne-/3^-f^(^). (4) 



The overbar denotes a sample average [ i.e. f{x) — 
X^fcLi fi^k) where / stands for an arbitrary function]. 



As can be seen by comparison with Eq. ^ , the integrand 
appearing in Eq. ([3]) is proportional to pi, and thus the 
main contributions to an accurate estimate of AF with 
Eq. (HI) come from realizations x (drawn from po) that 
are typical for the density pi. This means that the per- 
formance of such an estimate depends strongly on the 
degree of overlap of po with pi. If the overlap is small, 
the traditional free energy perturbation is plagued with 
a slow convergence and a large bias. This can be over- 
come by using methods that bridge the gap between the 
densities pq and pi, for instance the thermodynamic in- 
tegration. Since thermodynamic integration samples a 
sequence of many equilibrium distributions, it soon be- 
comes computationally expensive. Another method is 
umbrella sampling Q which distorts the original distri- 
bution in order to sample regions that are important for 
the average. Because of the distortion, the latter method 
is in general restricted to answer only one given question, 
e.g. the value of the free energy difference, but fails to 
give further answers. This is of particular concern, if in 
addition the values of some other thermodynamic vari- 
ables are sought, for example pressure or internal energy. 
There are dynamical methods [6| that make use of the 
Jarzynski work theorem 4] . They allow to base the esti- 
mator on work values of fast, finite time, non-equilibrium 
processes connecting system with system 1. However, 
the dynamic simulation of the trajectories is typically 
very expensive. 

Six years ago, the targeted free energy perturbation 
method 0] was introduced; a promising method which 
is based on mapping equilibrium distributions close to 
each other in order to overcome the problem of insuffi- 
cient overlap, without the need to draw from biased dis- 
tributions. However, this method is hardly used in the 
literature. An obstacle might be that there is no general 
description of how to construct a suitable map. A recent 
improvement is the escorted free energy simulation Q 
which is a dynamical generalization of the targeted free 
energy perturbation. 

Any free energy difference refers to two equilibrium 
ensembles. The above mentioned methods draw only 
from one of the two ensembles and propagate the sys- 
tem in direction of the other. Insofar, they are "one- 
sided" methods. However, it is of advantage to draw 
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from both equilibrium distributions and combine the ob- 
tained "two-sided" information. Optimizing the elemen- 
tary "two-sided" estimator for free enerjgy differences re- 
sults in the acceptance ratio method [3, [l^, [llj. The 
next step of improvement is to implement a "two-sided" 
targeted free energy method that optimally employs the 
information of drawings from both equilibrium distribu- 
tions. Our aim is to combine the advantages of the accep- 
tance ratio method with the advantages of the targeted 
free energy perturbation. 

The central result of this paper is a fluctuation theo- 
rem for the distributions of generalized work values that 
is derived and presented in section IIIII From this fluc- 
tuation theorem, the desired optimal two-sided targeted 
free energy estimator follows in section IIVI In section 
IVl appropriate measures are introduced which relate the 
overlap of po with pi to the mean square errors of the 
one- and two-sided free energy estimators. In section 
IVIl a convergence criterion for the two-sided estimator 
is proposed. From section IVIII on, numerics plays an 
important part. In particular, section FVII Al deals with 
explicit numerical applications. Based on the two-sided 
targeted free energy estimator, in section IVII Bl an es- 
timator for the chemical potential of a high-density ho- 
mogeneous fluid is established and applied to a dense 
Lennard-Jones fluid. Finally, the construction and per- 
formance of suitable maps is studied. 

In order to get some notation straight, we start by 
recalling the targeted free energy perturbation method. 



According to M , the phase space density po is mapped 
to the density po. 



Po(y) = / S{y - (j){x))po{x)dx, 



which can be written as 



Pojx) 
K{x) 



(8) 



(9) 



or 



Pa{y)dy^ / PQ{x)dx, Vr C To. (10) 
0(r) Jv 



In analogy to Eq. ^ , the targeted free energy pertur- 
bation is based on the identity 



(11) 



which follows from the densities ^ and ([9]) with /S.H 
being defined by 



AH(x) -.^ Hi{^{x)) - Ha{x) - - \nK{x). (12) 

P 

Multiplying Eq. HH) by e-'^^"'^'''^ pi{(j){x))K{x) and in- 
tegrating over Fq yields the targeted free energy pertur- 
bation formula. 



II. TARGETED FREE ENERGY 
PERTURBATION 

Let Fq and Fi denote the phase spaces of the systems 
and 1, respectively. We require that F.^ contains only 
those points x for which pi{x) is non-zero. 

Mapping the phase space points of system 0, a; ^ 4>{x), 
such that the mapped phase space Fq = (/)(Fo) coincides 
with the phase space Fi and such that the mapped dis- 
tribution Po overlaps better with the canonical distribu- 
tion pi results in the targeted free energy perturbation [Tj 
where the samples are drawn effectively from poj instead. 

Following the idea of Jarzynski 0], we introduce such 
a phase space map. If Fq and Fi are diffeomorph, there 
exists a bijective and differentiable map Ai from Fq to 



-PAF 



7W:Fo->Fi, M:x-^(j){x) 

where the absolute value of the Jacobian is 

del) 



dx 



K(x) 

The inverse map reads 

M-^:Ti->To, ■.x->(f)-^{x). 



(5) 



(6) 



(7) 



-''^^(-)po(x)dx. 



(13) 



An alternative derivation is given in f^. The traditional 
free energy perturbation formula ([3]) can be viewed as a 
special case of Eq. ((T^ . The latter reduces to the former 
if is chosen to be the identity map, (j){x) = x. [This 
requires that Fi = Fq holds.] 

Now an obvious estimator for Ai<", given a sample {xk\ 
drawn from po(a;), is 



.lne-/3Aff(^), 



(14) 



which we refer to as the targeted forward estimator for 
Ai^. The convergence problem of the traditional forward 
estimator, Eq. ([3]), in the case of insufficient overlap of po 
with pi is overcome in the targeted approach by choosing 
a suitable map M for which the image po of po overlaps 
better with pi. Indeed, suppose for the moment that the 
map is chosen to be ideal, namely such that pq{x) co- 
incides with pi(a:). Then, as a consequence of Eq. (fTTjl . 
the quantity AH{x) is constant and equals AF, and the 
convergence of the targeted estimator (|14l) is immediate. 
Although the construction of such an ideal map is im- 
possible in general, the goal of approaching an ideal map 
guides the design of suitably good maps. 
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To complement the one-sided targeted estimator, a sec- 
ond perturbation formula in the "reverse" direction is 
derived from Eq. ([TTjl , 

e+^^^ = / e+^^(*"'(^»pi(y)dy, (15) 
"'ri 

leading to the definition of the targeted reverse estimator 
AFi of AF, 

AFi = +i ine+/3A^f(0-i(y). (16) 

The index 1 indicates that the set {yk} is drawn from pi. 
Using the identity map (/'(x) = a; in Eq. (|16[) gives the 
traditional reverse estimator, which is valid if Tq = Ti 
holds. 

It will prove to be beneficial to switch from phase 
space densities to one-dimensional densities which de- 
scribe the value distributions of AH{x) and Ai/ (0"^ (y)), 
cf. Eqs. and (fT5|) . This is done next and results 
in the fluctuation theorem for generalized work distribu- 
tions. 



III. FLUCTUATION THEOREM FOR 
GENERALIZED WORK DISTRIBUTIONS 

We call AH(x), x G Fq, function of the generalized 
work in forward direction and A_ff(0~^(y)), y £ Fi, func- 
tion of the generalized work in reverse direction, having in 
mind that these quantities are the functions of the actual 
physical work for special choices of the map A4 [T^ . 

The probability density p(VF|0; M) for the outcome of 
a specific value W of the generalized work in forward 
direction subject to the map M when sampled from po 
is given by 

p{W\0;M)= I S{W -AH{x))po{x)dx. (17) 

Conversely, the probability density p(T/F|l;A^) for the 
observation of a specific value W of the generalized work 
in reverse direction when sampled from pi reads 

piW\l;M)^ f S{W^AH{(t>-Hy)))pi{y)dy. (18) 

Relating the forward and reverse "work" probability den- 
sities to each other results in the fluctuation theorem 

p{W\0;M) _ ^(w-AF) 

piW\l;M) ■ ^ ^ 

This identity provides the main basis for our further 
results. It is established by multiplying Eq. (fTTj) with 

S{W — AH{x))pi{(f>{x)) and integrating with respect to 
(j){x). The left hand side yields 

/ S{W -AH{x))poWx))d(b{x) 
J</.(ro) 

^ I 5{W -AH{x))po{x)dx =p{W\0;M), (20) 
•''ro 



and the right hand side gives 

f g/3(Ai?(.)-AF) _ AJ?(a;))pi (0(a;))(i0(x) 
J0(ro) 

^^0iw-AF) f s{W-AH{rHy)))pi{y)dy 

= e^('^-^^)p(l¥|l;Al). (21) 

It is worth to emphasize that the fluctuation theorem 
(I19p is an exact identity for any differentiable, bijective 
map A4 from Fq to F i . E specially, it covers known fluc- 
tuation theorems [l3l.Tl3. [isL related to the physical 
work applied to a system that is driven externally and 
evolves in time according to some deterministic equa- 
tions of motion, e.g. those of Hamiltonian dynamics, 
Nose-Hoover dynamics or Gaussian isokinetic dynamics 

As an example, consider the time-reversible adiabatic 
evolution of a conservative system with Hamiltonian 
H\{x), depending on an externally controlled param- 
eter A (e.g. the strength of an external field). Let 
x{t) — ct){xo , t; X{-)) with x{Q) = xq he the flow of the 
Hamiltonian system which is a functional of the param- 
eter X{t) that is varied from A(0) = to A(r) = 1 ac- 
cording to some prescribed protocol that constitutes the 
forward process. The Hamiltonian flow can be used to 
define a map, A4 : x ^ (f){x) := (j){x,T; X{-)). Since the 
evolution is adiabatic and Hamiltonian, no heat is ex- 
changed, (5 = 0, and the Jacobian is identical to one, 
1^1 = 1. Consequently, the generalized work in for- 
ward direction reduces to the physical work applied to 
the system, AH{x) = Hi{<p{x)) - Ho{x) = W. For 
each forward path {x{t),X{t); < t < t} we have a 
reverse path {x^{t — t), X'^ {t — t); < t < t}, where 
the superscript T indicates that quantities that are odd 
under time reversal (such as momenta) have changed 
their sign. The generalized work in reverse direction re- 
duces to the physical work done by the system, —W — 
-[Hoi^-'iyV) - H.iy^) = -Ho{r'{y)) + H,{y) = 
AH {y)) . Starting the forward process with an ini- 
tial canonical distribution, po{x), some probability dis- 
tribution for the physical work in forward direction fol- 
lows, p(VF|0;A^) -. p^{W). Starting the reverse pro- 
cess with an initial canonical distribution, pi{y), some 
probability distribution for the physical work in reverse 
direction follows, p{W\l;M) =: p^{-W). The distri- 
butions p^{W) and p^{—W) are related to each other 
by the identity which coincides with the fluctuation 
theorem of Crooks [13]. 

From the fluctuation theorem (|19p some important in- 
equalities follow that are valid for any map A4. First 
of all we state that the targeted free energy perturba- 
tion formulas (fT3|) and (fT5|) can be regarded as a simple 
consequence of the fluctuation theorem and can be 
rewritten in terms of the generalized work distributions, 
e-/3AF ^ (e-fW)^ and e+^^^ = (e+f^^)^, where the an- 
gular brackets with subscript i denote an ensemble aver- 
age with respect to the density p(VF|i; A^), i = 0, 1. The 
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monotonicity and convexity of the exponential function 
appearing in the above averages aUows the appUcation of 
Jensen's inequahty, (e^''^) > e^^^^^. From this follows 
the fundamental inequality 



{W),<AF<{W),, 



(22) 



which shows that the values of the average work in for- 
ward and reverse direction constitute an upper and a 
lower bound on AF, respectively. 

Concerning one-sided estimates of AF, the targeted 
forward and reverse estimators (fTi|) and (|16p can be writ- 
ten AFo = _ie-/3i^° and AFi = ^e^, where the 
overbar denotes a sample average according to a sam- 
ple {VFfc} and {W^/j!^} of forward and reverse work values, 
respectively. Similarly to one finds the inequalities 
AFo < and AFi > W. 

Taking the ensemble averages (AFi 



averages 

=F;g (^IneT/Siv'^ , i = 0,1, of the one-sided esti- 
mates and applying Jensen's inequality to the averages 
of the logarithms, (hie^/^'^''\ < In < 
one obtains 



T/3AF, 



(W)^ < \^Fij^ <AF< 



AF, 



(23) 



In other words, the forward and reverse estimators are 
biased in opposite directions for any finite size N of the 
work samples, but their mean values form closer upper 
and lower bounds on AF than the values of the mean 
work do. 

So far, we were concerned with one-sided estimates of 
AF, only. However, the full power of the fluctuation 
theorem (fT9|) will develop when dealing with a two-sided 
targeted free energy estimator where a sample of forward 
and reverse work values is used simultaneously, since the 
fluctuation theorem relates the forward and reverse work 
probability densities to each other in dependence of the 
free energy difference. 

In the next section, we will drop to mention the tar- 
get map A4 explicitly in order to simplify the notation. 
For instance, we will write p{W\i), but mean p(W^|i; M), 
instead. 



TWO-SIDED TARGETED FREE-ENERGY 
ESTIMATOR 



IV. 



An important feature of the fluctuation theorem ([Tt 
is that it provides a way to answer the following ques- 
tion: Given a sample of no work values {W^*^} = 
{Wf, . . . , W^jj} in the forward direction and a second 
sample of ni work values {W^/ } = {W^, . . . , W^^} in the 
reverse direction, what would be the best estimator of 
AF that utilizes the entire two samples? 

If drawn from an ensemble that consists of forward 
and reverse work values, the elements are given by a pair 



of values {W, Y) of work and direction, where Y = 
indicates the forward and Y = I the reverse direction. 
The probability density of the pairs {W,Y) is p{W,Y). 
The probability density for the work is p{W) :~ p{W, 0) + 
p{W, 1), and that for the direction is py ■— J p{W, Y)dW . 
Bayes theorem. 



p{W\Y)pY ^p{Y\W)p{W), 
implies the "balance" equation 



(24) 



Pijp{Q\W)p{W\l)dW = pojp{l\W)p{W\Q)dW. (25) 

From the fluctuation theorem (|19p and Bayes theorem 
follows 



p{l\W) 



^ ^I3(W~C) 



with 



C = AF+iln^. 



(26) 



(27) 



Together with the normalization p(0|VF) -|-p(l|VF) — 1, 
Eq. (pS)) determines the explicit form of the conditional 
direction probabilities 



p{Y\W) 



^YI5{C-W) 

1 + eP{C-w) ' 



r 0, 1. 



(28) 



Replacing both, the ensemble averages by sample av- 
erages and the ratio ^ by the balance equation, 

Pi {p{Q\W))-^ — Po {p{1\W))q, results in the two-sided tar- 
geted free energy estimator, nip(0|M^^) = nop{l\W^), 
which reads 



"0 



fit l + e-''(^^''^ + ^'"^-^°) 



(29) 



It is worth to emphasize that this estimator is the opti- 
mal two-sided estimator, a result that is shown with a 
constraint maximum likelihood approach in Appendix □ 
A derivation of this estimator is also given by Shirts et 
al. 11] in the framework of a maximum likelihood ap- 
proach. 

If samples of ng forward and ni reverse work values 
and {W/} are given, but no further information is 
present, it is the two-sided estimator that yields the 
best estimate of the free energy difference with respect 
to the mean square error. If needed, the samples {W°} 
and {W^j } can be obtained indirectly by drawing samples 

{xi} and {yj} of po and pi and setting = AH{xi) 

AHirHyj))- 

Opposed to the one-sided estimators (fH)) and (fTB|l . 
the two-sided targeted free energy estimator (I^H]) is an 



and 
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implicit equation that needs to be solved for Ai^oi- Note 
however that the solution Ai^oi is unique. 

Let us mention a subtlety concerning the choice of 
the ratio The mixed ensemble {(W^, F)} is spec- 

ified by the mixing ration and by the conditional 
work probability densities p(W^|y). With the mixed en- 
semble we are free to choose the mixing ratio. For in- 
stance, replacing the ensemble averages in the balance 
equation (j25p by sample averages results in an estima- 
tor pip{0\W^) = pqp{1\W°) for AF that depends on the 
value of the mixing ratio. This raises the question of the 
optimal choice for As shown in the Appendix „ it is 
optimal to choose the mixing ratio equal to the sample 
ratio, Hi = A result that may be clear intuitively, 
since then the mixed ensemble reflects the actual samples 
best. 

Other free energy estimators follow, if the explicit 
expressions (|28|) and the definition of the constant C, 
Eq. (P7)) . are inserted in the balance equation ([^S]) . The 
latter can then be expressed as 

/ l+ofHC-^v) P(W\l)dW 



,^^^p{W\0)dW' 
and results in the estimator 

ni 2^j=l 



AFB(C) = C+-ln_^ 



l+C 



(30) 



(31) 



l+c 



The non-targeted version of this estimator, i.e. for M. — 
id. , is due to Bennett Q who used a variational principle 
in order to find the estimator for the free energy difference 
that minimizes the mean square error. 

Equation (j30p is an identity for any value of C, since 
with the ratio the value of C = AF -f i In 2i can 

Po ^ /3 Po ^ 

be chosen arbitrarily. However, concerning the estima- 
tor (pij) . different values of C yield different estimates. 
Bennett's choice is 



CB = AF+iln^, 



(32) 



i.e. Hi = lii which results from minimizing the mean 

Po no ' ^ 

square error ^(AFb — AF)^^, where the angular brack- 
ets denote an average over infinitely many repetitions of 
the estimation process (I5T|) with no and n\ being fixed. 
According to the Appendix „ Bennett's choice is also op- 
timal for any target map M.. 

With C = Cb, Eq. dSIl) has to be solved in a self- 
consistent manner which is tantamount to solve__the two- 
sided targeted estimator (^5)1 . In other words, AFb(Cb) 
is the unique root AFqi of Eq. 



V. OVERLAP MEASURES AND MEAN 
SQUARE ERRORS 

In this section we introduce measures for the over- 
lap of Po with pi, or, equivalently, of p{yV\^\M) with 
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FIG. 1: Targeted work probability distributions for the ex- 
pansion/contraction of a cavity in an ideal gas and the associ- 
ated overlap distribution. The up (down) triangles display the 
normalized histogram of a sample of forward (reverse) work 
values. The smooth solid curves are the exact analytic work 
distributions p{W\Q\M) (right) and p{yV\\\M) (left), and 
the dashed curve shows their overlap distribution pc>i(VF|A1). 
The straight vertical lines show the values of the targeted es- 
timates of on the abscissa. From left to right: the reverse, 
the two-sided (which is indistinguishable from the exact ana- 
lytic value) and the forward estimate. 



p(IF|l; A^) and relate them to the mean square error of 
one- and two-sided estimators. 

The estimators (Hi]), ([H]) and (gHl) are subject to 
both, bias and variance. Taking both errors into ac- 
count results in the mean square error. Let us consider 
the mean square errors of the one-sided targeted esti- 
mators first. They read msco :— (^{AFq — AF)^^ = 

(^(inc^^f'^"^'^'^))^^ in forward direction, and anal- 
ogously in backward direction. In forward direction, 
it can be quantified by expanding the logarithm into 
a power series about the mean value of its argument. 



I 



1, and neglecting terms of higher or- 



der in -i-, which gives 



-P{W-l^F) 



(33) 



Equation is valid for a sufficiently large sample size 
N (large N hmit) With the use of the fluctuation 

theorem (|19p . the variance appearing on the right hand 

side of Eq. ^ can be written ^(e^''^^"'^-^) ~ -^)^) 
^^^-p^w^AF^^^^ _ 1 > ^-/3((w),-Af) _ ^ This yields the 
inequality 



(34) 
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In the same manner as above the inequahty 



(3 msei > 



1 



(35) 



is obtained for the mean square error msei of the reverse 
estimator AFi. 

The inequahties (I34p and ()35|) specify the minimum 
sample size N that is required to obtain a forward 
and reverse estimate AF, respectively, whose root mean 
square error -^mse is not larger than kT. Namely, 
TV > e'^^'^^~^^^i^ is required for a forward, and N > 
Q0i{w)Q-AF) ^ reverse estimate. Similar expressions 
are found in Ref. [3]. Since the required sample size N 
depends exponentially on the dissipation, it is good to 
choose a target map M which reduces the dissipation in 
the opposite direction. 

The dissipation is related to the overlap of po with 
pi. The overlap of two probability densities na{z) and 
7rb(z) of a random variable z can be quantified with the 
Kullback-Leibler divergence 



TTa{z)\Q. -rrdz, 

T^byz) 



(36) 



a positive semidefinite measure that yields zero if and 
only if -Ka is identical to TTf,. Applied to the densities 
Pi and Po, the Kullback-Leibler divergence turns out 
to be identical with the Kullback-Leibler divergence of 
p{W\\]M.) with p{W\Q;M) and results in the general- 
ized dissipated work in reverse direction, 

D{p^\\po) = D{p{W\V,M) \\p{W\Q-M)) 

= mF-{W),), (37) 

which is established with the use of Eqs. pTjl and 
and the fluctuation theorem (jl9p . Similarly, we have 

D{pa\\p^) = D[p[W\Q-M) |b(W^|l; A^)) 

= p{{W),-AF). (38) 

For the one-sided targeted free energy estimators this 
means that choosing a target map which reduces the dis- 
sipation in the opposite direction is the same as choosing 
a target map which enhances the overlap of po with pi . 

Now, we proceed with the overlap measure and the 
mean square error of the two-sided free energy estimator 
(|29p . In order to keep the notation simple, we assume 
that the samples of forward and reverse work values are 
of equal size, no = ni = N. (A generalization to no ^ n\ 
is straightforward possible, but not given in this paper.) 

Consider the overlap density poi(W^I-A^), 



Pom\M) 



1 p{wpM)v{^\^-M) 



A^x p(W^|0;7W) +]5(W^|1;M)' 
where the normalization constant A^y reads 



(39) 



p{W%M)p{W\\\M) 



p{w\^-M) 



■dW 



-p{W\l;M) 

poiy)piiy) 



po{y) + Piiy) 



dy. (40) 



Aoi is a measure for the overlap area of the distributions 
and takes its maximum value ^ in case of coincidence. 
Using the fluctuation theorem ^T9^ , the two-sided overlap 
measure can be written 



Aoi 



1 



1 + Q0{AF-W) 



1 



I ^ Q-P{AF-W) 



(41) 



Comparing Eq. (|1T|) with the two-sided targeted free en- 
ergy estimator (p9)) . one sees that the two-sided targeted 
free energy estimation method readily estimates the two- 
sided overlap measure. The accuracy of the estimate de- 
pends on how good the sampled work values reach into 
the main part of the overlap distribution po\{W\A4). By 
construction, the overlap region is sampled far earlier 
than the further distant tail that lies in the peak of the 
other distribution, cf. Fig. [T] This is the reason why 
the two-sided estimator is superior if compared to the 
one-sided estimators. 

In the large N limit the mean square error mseB(A'^) = 



(0 



^AFqi — AFj j of the two-sided estimator can be ex- 
pressed in terms of the overlap measure and reads 



mseB(iV) = 1(^-2 



(42) 



cf . [Tl[ . Note that if an estimated value Ao\ is plugged 
in, this formula is valid in the limit of large N only, but it 
is not clear a priori when this limit is reached. Therefore, 
we develop a simple convergence criterion for the two- 
sided estimate. 



VI. CONVERGENCE 

In this section, a measure for the convergence of two- 
sided estimate is developed, again for the special case 
no = ni = N . First, we define the estimate A^i of the 
overlap measure Ao\ with 



1 ^ 



1 



(43) 



which is equal to X]i=i 



stand the estimate A_Foi to be obtained according to 
(j29p with the same samples of forward and reverse work 
values. Since the accuracy of the estimated value Ao\ 
is unknown, we need an additional quantity to compare 
with. 

Another expression for the overlap measure is 




(44) 



which can be verified with the fluctuation theorem ((19)) . 
Based on Eq. (jH]), we define the overlap estimator of 
second order 



1 ^ / 1 

1 ^ 



1 



I ^ Q-f3{AFoi-W°) 



(45) 



Because AFqi converges to AF, both, Aqi and A^^[' 
converge to Ao\ in the limit N oo. However, the sec- 
ond order estimator Aq"' converges slower and is for small 
N typically much smaller than Aqi, since the main con- 
tributions to the averages appearing in Eq. (jH]) result 
from work values that lie somewhat further in the tails 
of the work distributions. 

We use the relative difference 



a{N) 



A. 



ol 



A (11) 
^ol 



(46) 



lol 



to quantify the convergence of the two-sided estimate 
AFoi, where Ad, A^"' and A_Foi are understood to be 
calculated with the same two samples of forward and re- 
verse work values. 

From Eqs. (gg), dS]), and (HH) follows that < < 

2A01 holds. Hence, the convergence measure a(N) is 
bounded by 



-1 < a{N) < 1 



(47) 



for any N. A necessary convergence condition is a{N) — > 
0. This means that only if a{N) is close to zero, the 
two-sided overlap estimators can have converged. Typi- 
cally, a{N) being close to zero is also a sufficient conver- 
gence condition. Hence, if a{N) is close to zero, the mean 
square error of AFqi is given by Eq. (|42)l with Ad « Ao\. 
As can be seen from Eq. (|42p , the mean square error and 
in turn the variance and the bias are reduced by both, 
by taking a larger sample size N and by choosing a map 
M that enhances the overlap of pq with pi . 

With the targeted free energy estimators at hand, to- 
gether with their mean square errors, we are now ready 
to compute free energy differences numerically. 



VII. NUMERICAL EXAMPLES 

We investigate two numerical applications. One is the 
free energy difference of a fluid subject to the expansion 
of a cavity which allows the comparison with published 
results [3] . The other is the chemical potential of a fluid 
in the high density regime. 

Beneath an ideal gas, the fluid is chosen to be a 
Lennard-Jones fluid with pairwise interaction 



Virki) 




(48) 



\ 



'^R^ 1 









FIG. 2: The geometric setup. 



where rki is the distance between the k-th and 1-th par- 
ticle, rki = \rk — ri\. The parameters used are those of 
argon, a = 3.542 A and e/fc 93.3 K 0. 

In all applications, the samples from the densities po 
and pi are simulated with the Metropolis algorithm [l9[ . 
In order to simulate macroscopic behavior with a small 
number Np of particles, periodic boundary conditions 
and the minimum image convention Q are used. Pair- 
wise interactions are truncated at half of the box length 
Rhox — L/2, but are not shifted, and the appropriate 
cut-off corrections are applied Q . 



A. Expansion of a cavity in a fluid 

The expansion of a cavity in a fluid is given by the 
following setup: Consider a fluid of Np point molecules 
with pairwise interaction V{rki) confined in a cubic box 
of side length 2i?box, but excluded from a sphere of ra- 
dius R < Rhox, compare with Fig. [21 Both, the box 
and the sphere are centered at the origin r = 0. A con- 
figurational microstate of the system is given by a set 
X — (ri, . . . , r^Tp) of particle positions r^. Growing the 
sphere from R = Rq to R — Ri decreases the volume ac- 
cessible to the particles and the fiuid is compressed. We 
are interested in the increase of free energy AF subject 
to the compression of the fluid. Since the kinetic contri- 
bution to the free energy is additive and independent of 
R, the difference AF depends only on the conflgurational 
part of the Hamiltonian. The latter reads 

j:Virk,ri) if xeT,, 
H,{x) ={k<i ^ (49) 

00 \i X ^ r,: 

with I = 0, 1. Fq and Fi denote the accessible parts 
of configuration space of the system {R = Rq) and 1 
{R = i?i), respectively. We assume that Rq < Ri holds 
which implies Fi C Fq. 
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Drawing a sample {xk} from po and applying the tra- 
ditional forward estimator ^ results in the following: 
^-pAH{xk) takes the values one and zero depending on 
whether Xk G Fi or not, i.e. whether the region between 
the two spheres of radius Rq and i?i is found vacant of 
particles or not. A comparison with Eq. ([3]) reveals that 
g-/3A_F jg probability for the spherical shell being ob- 
served devoid of particles 0- Hence, the rate of conver- 
gence of e^P^H decreases with the latter probability and 
will in general be poor. 

Conversely, drawing a sample from pi and apply- 
ing the traditional reverse estimator AFj^ = ^el^^^^y^ 
(Eq. ([TH]) with (j){x) = x) figures out to be invalid, be- 
cause the term Qf^^^'^y*') takes always the value one. In 
concequence, the traditional reverse estimator is incon- 
sistent. The deeper reason for this is that Fi C Fq holds: 
Eq. ^ is valid only for a; S Fi. By the same reason, the 
traditional two-sided estimator is invalid, too. 

The mentioned shortcomings are avoided with a well 
chosen target map. Consider mapping each particle sep- 
arately according to 



(l){x) = ip{ri) 



(50) 



where — \rk\ is the distance of the k-th particle with 
respect to the origin, and ip : (i?o,-Rmax] (-Ri,-Rmax] 
is a bijective and piecewise smooth radial mapping func- 
tion. In order not to map particles out of the confining 
box, it is required that tl;{r) = r holds for r > i?box- The 
Jacobian for the radial map (|50|) reads 



dx\ -^J- 



(51) 



(This formula is immediately clear when changing to po- 
lar coordinates) We use the map of Ref. [7] which is 
designed to uniformly compress the volume of the shell 
Ro < r < i?box to the volume of the shell Ri < r < i?box- 
Thus, for r € {Rq, Rhox] the radial mapping function ip{r) 
is defined by 



i^irf -Rl = c{r^ - i?; 



(52) 



with the compression factor c = {R'^^^ — Rf) / {R^ox ^ -^o) • 
According to Eq. (I51|) . we have liiK(x) = i^ix) Inc, where 
I'ix) is the number of particles in the shell Rq < r < 

-^box- 



Ideal Gas 



X reads AH{x) = —^i/{x)\nc and takes discrete values 
only, as v{x) = n holds with n S {0, 1, . . . , A^p}. Con- 
sequently, the probability p{Wn\Q]M.) of observing the 
work Wn = — ^ In c in forward direction is binomial. 



p{Wn\Q;M) 



Nr 



qo) 



(53) 



where qo = ^tt{R^^^ — Rq)/Vo is the probability of any 
fixed particle to be found in the shell Rq < r < i?box- 
In analogy, the probability distribution p{Wn\l] M) for 
observing the work W = Wn in reverse direction is given 
by replacing the index with 1 in (j53p . Finally, the work 
probability distributions (rather then the densities) obey 
the fluctuation theorem (fT9| for any n = 0, 1, . . . , Ap, 



piWn\0;M) ^ 1_ 
p{Wn\l;M) c« 



Nr, 



„/3(W„-AF) 



(54) 



A simple numerical evaluation highlights the conver- 
gence properties. Choosing the parameters to be 2i?box — 
22.28 A, Ro = 7 A, Ri = 10 A, and Ap = 125 (/? arbi- 
trary) , the free energy difference takes the value (3AF = 

42.1064. Because e"'''^^'^^^ can take only the numbers 
zero and one, the probability of observing a configura- 
tion X with non- vanishing contribution in the traditional 
forward estimator of AF is e"'^'^^ « 10~^^. Hence, in 
practice it is impossible to use the traditional method 
successfully, since it would require at least A'p-lO^^ Monte 
Carlo trial moves. However, the targeted approach al- 
ready gives reasonable estimates with a sample size of 
just a few thousands. Figure [1] shows estimates of the tar- 
geted work probability distributions for samples of size 
A = 10** from po and pi each. While the forward distri- 
bution p{W\0;M) is obviously well sampled in the cen- 
tral region, the sampling size is too small in order to reach 
the small values of PW where the reverse distribution 
p(M^|l;Al) is peaked. Exactly the latter values would 
be required for an accurate exponential average in the 
targeted forward estimator Eq. (|14p . Therefore, the tar- 
geted forward estimate of AF is still inaccurate; it yields 
PAFo — 45.0 ± 0.3. The same is true for the targeted 
reverse estimate (IT51) which gives /3AFi = 41.3 ± 0.5. 
The errors are calculated using root mean squares and 
propagation of uncertainty. A more accurate estimate 
follows from the targeted two-sided estimator which 

yields /3AFoi = 42.1 ±0.1 (no = m = A^). This is clear, 
as for the two-sided estimate it is sufficient yet that the 
forward and reverse work- values sample the region where 
the overlap distribution poi{W\M), Eq. ([55]) . is peaked, 
which is obviously the case, cf. Fig. [T] 



As a first illustrative and exact solvable example we 
choose the fluid to be an ideal gas, V{rki) — 0. In this 
case the free energy difference is solely determined by the 
ratio of the conflned volume Vi = 8R^^^ — |7ri?f , i = 0, 1, 
and is given by (3AF = — Ap In {Vi/Vq). Using the radial 
map (|52p . the work in forward direction as a function of 



2. Lennard-Jones fluid 

We now focus on particles with Lennard-Jones inter- 
action The parameters are chosen to coincide with 
those of Ref. 0], i.e. 2i?box = 22.28 A, Rq = 9.209 A, 
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FIG. 3: Free energy estimates for the expansion of a cavity in 
a Lennard- Jones fluid. Shown are average values of traditional 
and targeted estimates of AF in dependence of the sample size 
A'', with an errorbar of one standard-deviation. In order to 
distinguish the data points those corresponding to targeted 
estimates are shifted to the right and are spread, whereas 
those corresponding to traditional estimates are shifted to 
the left. For example, all four data points in the vicinity 
of = 10000 refer to iV = 10000. The dashed horizontal line 
represents a targeted two-sided estimate with A'^ = 7.5 ■ 10^, 
see Table H 



Ri = 9.386 A, Np = 125, and T = 300 K. In Lennard- 
Jones units, the reduced densities p* = Np/Vi ■ cr^ of the 
systems (i? = Rq) and 1 {R = Ri) are Pq = 0.713 and 
pI = 0.731, respectively, and T* = l/(/3e) = 3.215 holds 
for both. If we had an ideal gas, the probabiHty of ob- 
serving the space between the spheres of radius Rq and 
Ri to be vacant of particles would be {Vi/Vq)'^'' = 0.044. 
Because of the strong repulsive part, this probability is 
much smaller in case of a dense Lennard- Jones fluid. 

We generate samples of po{x) and pi{x) with a 
Metropolis Monte Carlo simulation. Each run starts with 
1000 equilibration sweeps, followed by the production 
run. In the production run the configurational microstate 
X is being sampled every 4-th sweep only in order to re- 
duce correlations between successive samples. The use of 
decorrelated data is of particular importance for the self- 
consistent two-sided estimate Ai^oi: because it depends 
intrinsically on the ratio — of the numbers of uncorre- 
Zated samples, cf. Eq. ((29| . 

Fig. [3] gives an overview of independent runs with dif- 
ferent sample sizes iV, where the one- and two-sided 
targeted estimators can be compared with each other 
and with the traditional forward estimator. Displayed 

is the estimated mean AF(7V) in dependence of the sam- 
ple size iV. The error bars reflect the estimatet stan- 

AF(iV)) . Each mean and 



each standard-deviation is estimated using z(N^ inde- 
pendent estimates AF{N). In ascending order of N, 
z{N) reads 400, 100, 20, 7. For the two-sided estimates, 
no — ni — N is used and Eq. (^5]) is solved. 

Note that the theoretical mean of traditional forward 
estimates of AF is infinite for any finite N, because of 
the finite probability of observing a sequence of length N 
of solely vanishing contributions to the exponential aver- 

RA u — -trad 

age e"''^^. Strictly spoken, the estimator /SAFq = 
— /3~^ Ine"'''^^ is not well defined, because Fi C Fq. 
Nevertheless, in Fig.[3]there are two finite observed mean 
values of traditional forward estimates displayed, what by 
no means is a contradiction. Infinite values are observed 
in the cases where N < 10^ holds. This is symbolized 
by the rising dashed line. The mentioned ill-definiteness 
of the traditional estimator is removed by using the map 
([52|) . Figure [3] shows that all three targeted estimators 
are consistent even for small N in the sense that the error 
bars overlap. Whereas the targeted forward and reverse 
estimators show to be decreasingly biased with increasing 
N, the targeted two-sided estimator does not show any 
noticeable bias at all. This example demonstrates how 
worth it can be to take all three estimators, forward, 
reverse, and two-sided, into account. The one-sided es- 
timators are biased in opposite directions and may serve 
as upper and lower bounds for AF, Eq. whereas the 
two-sided is placed in between the one-sided. 

We conclude this example with explicit estimates ob- 
tained from a single run with N = 750000, that are sum- 
marized in Table [H The errors are derived using block 
averages [2]| and propagation of uncertainty. 

TABLE I: Cavity in a Lennard- Jones fluid. Estimated free 
energy differences /3AF for the expansion of a cavity, using 
targeted and traditional estimators. N = 7.5 • 10^. 



Method 


PAF 


traditional forward 


7.500 


± 


0.050 


targeted forward 


7.442 


± 


0.005 


targeted two-sided 


7.439 


± 


0.002 


targeted reverse 


7.420 


± 


0.010 



B. Chemical potential of a homogeneous fluid 

Consider a fluid of Np particles confined within a cu- 
bic box of volume 14 — (2i?box)'^ with pairwisc inter- 
action V{rij). The configurational Hamiltonian for the 
A^p-particle system at a; = (ri, . . . , r^rp) reads 



HN^x) = J2vir,,). 



(55) 



dard deviation 



The configurational density for the A'p-system is given by 
p^p(x) = e-''«".(-)/ZArp, (56) 
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with the partition function Z^^ = J e'~^^"i''^^^dx. Now 
consider one particle is added: the position of this new 
particle may be r^Vp+i- The equilibrium density of the 
{Np + l)-particle system reads 



l(an,rA 



(57) 



Taking the ratio of the densities (l56lj and ([57|) leads to 
Widoms particle insertion method |22l | for estimating the 
excess chemical potential of the iVp-system, defined as 
the excess of the chemical potential /i to that of an ideal 
gas at the same temperature and density. For sufficiently 
large Np, fj,''^ can be approximated with 



P Zn^c 



(58) 



Turning the tables, we use Eq. ((58)) to be the definition of 
the quantity The particle insertion method inserts 
at a random position an extra particle to the A^p-system 
and measures the increase of energy that results from 
this particle. Since we consider a homogeneous fluid, we 
may as well fix the position of insertion arbitrarily, for 
instance at the origin, what is done in the following. We 
define system 1 through the configuration-space density 
Pi{x) at follows: 

Pi{x)^Vc j 5(rjVp+i)PA'p+i(a;,rjVp+i)drjVp+i. (59) 

The factor Vc ensures normalization. Written in the usual 
form pi{x) ~ e~^-'^^^-^^ /Zi, we have 



H,{x)^HN,{x)+Y,V{rk) 



(60) 



fc=i 



and Zi = Zn^^i/Vc- System 1 can be understood as 
an equilibrium system of Np interacting particles in the 

A'p 

external potential ^{''^k)i due to one extra particle 
fc=i 

fixed at the origin r = 0. Further, we identify system 
with the A'p-particle system and rewrite 

Po{x) = PNj,{x), Ho{x) = Hn^{x) (61) 

and Zq = Zn^. The ratio of po and pi has the familiar 
form of Eq. with AF being identical to p,^^, 



(62) 



The energy difference AH{x) = Hi{x) — Ho{x) is the 
increase of energy due to an added particle at the origin 
r = 0, 



AH{x)=J2^ir,) 



(63) 



Assume a finite potential V{r) for non-vanishing r (i.e. 
no hard-core potential) , but with a strong repulsive part 
for 7- — > (a so-called soft-core potential), e.g. a Lennard- 
Jones potential. In this case, the configuration spaces of 
system and 1 conicide, i.e. Fg = Fi. Thus a tradi- 
tional estimate of is in principle valid in both di- 
rections, forward and reverse. In forward direction we 
have the equivalent to the particle insertion method [2^ , 

Pp'^^o = — In e~'''^^(^) , but with fixed position of in- 
sertion r = 0. Here x is drawn from po and we will 
typically find a particle in a sphere of radius r centered 
at the origin, f can roughly be estimated by the mean 
next-neighbor distance (Vc/Np)^^'^ of an ideal gas. The 
dominant contributions to the exponential average come 
from realizations x that resemble typical realizations of 
system 1 [l3|. However, typical realizations x of the sys- 
tem 1 do not contain any particle within a sphere of some 
radius rhc centered at the origin, because of the extra par- 
ticle fixed at the origin and the strong repulsive part of 
the interaction, rhc may be regarded as a temperature- 
dependent effective hard-core radius of the interaction 
l3V{r). We conclude that the insertion method is accu- 
rate and fast convergent only, if r^^ <<C f^, i.e. for low 
densities. Concerning the reverse traditional estimator 

f3p'^^i = Ine^^-f^(i'), where y is drawn from pi, the 
same argumentation reveals the impossibility of obtain- 
ing an accurate estimate in this way. Effectively, the par- 
ticles of system 1 cannot access the vicinity of the origin, 
no matter how large the sample size will be. In this sense, 
Fi can be substituted with an effective F^^'^ C Fi = Fq, 
implying that the traditional reverse estimator tends to 
be inconsistent. 



1. Constructing a map 

Again, we use a radial map the particle positions, 
0(x) = (Ri, . . . ,RArp), with Rfc = '(/)(r'fe)^. In searching 
a suitable radial mapping function i^{r), we are guided 
by the mean radial properties of the systems themselves. 
The radial probability density go{r) of finding a particle 
in distance r from origin in system is 



50 



and that for system 1 is 



1 f 

" ]V~ ^ J '^'^^'^ ~ r)poix)dx, 



(64) 



rk - r)pi{x)dx. 



(65) 



k=l 



Due to the interaction with the extra particle fixed at the 
origin in system 1, gi{r) will in general be quite differ- 
ent from go{r). The latter is related to a homogeneous 
fluid and is proportional to (for r < Rbox), whereas 
the former refers to an inhomogeneous one and is pro- 
portional to r^e"^^*^""^ in the limit r ^ [l^]. For large 
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FIG. 4: The radial densities gi{r) and go{r) for a dense 
Lennard- Jones fluid (p* = 0.9 and T* = 1.2), estimated from 
simulated data. The ratio gi{r)/go{r) equals the radial dis- 
tribution function rdf(r). 



FIG. 5: Simulated radial mapping function tp*{r) for a dense 
Lennard- Jones fluid (solid). maps the radial density goir) 
to gi{r), cf. Fig. U For the ideal gas, tp* is the identity map 
(dashed) . 



r, however, the influence of the extra particle vanishes 
and gi{r) — > goir). Evaluation of the definition (jM)) of 
go yields 



9o{r) = yMr), 



(66) 



where ho{r) accounts for the decay of volume in the 
corners of the confining box and is given by hQ{r) = 



J J ^^^^ sin 9 d(j)d9. The integration extends over the frac- 
tion of surface A{r) of a sphere with radius r that lies 
inside the confining box. Note that hQ{r) — Att for 
r < i?box- In contrast to go, gi depends on the interac- 
tion V{r). After some transformations of the right-hand 
side of Eq. ([^ . gi can be written 



51 W 



2 ~l3V{r) 



Vc 



-hiir). 



(67) 



The function hi{r) can be written (cf. [22]) 



hi{r)^e^^^'^hoir) 



Np-l . 

~P T.^ (y(rfc)+V(|r,-r„p|)) 



(Np-l) 

(68) 



where the angular brackets denote an average with a Np— 
1 particle density according to Eq. (|56p and the vector 
Yjs!^ is arbitrarily fixed, but of magnitude r. Further, the 
approximation Zfq^-i/ Zfq^^i « is used. 

We note that the ratio of gi with go yields the well- 
known radial distribution function rdjir) of the Np + 1- 
particle fluid. 



rdflr) 



9o{r)' 



(69) 



Fig. |4] shows estimates of 50 Siiid gi for a dense Lennard- 
Jones fluid with parameter values of argon (see below 
Eq. ([48|l ). obtained from Monte Carlo simulations. 

Now define a function ip*ir) by the requiring that it 
maps the mean radial behavior of system to that of 
system 1. This is done by demanding 



V>*(r) 



which yields 



gi{t)dt= / go{t)dt, 



d^* ^ gojr) 
dr gi{^*{r))' 



(70) 



(71) 



In the limiting case of an ideal gas, gi — go holds and the 
map becomes an identity, ip*{r) = r. Of practical interest 
are the cases where gi is unknown and thus Eq. (|70p can 
not be used to derive '4>*{r). However, the function ip* 
can be estimated with Monte Carlo simulations without 
knowledge of gi and go as follows. 

Take a sufficiently large amount n of samples Xj — 
(rij, . . . , rjVpj), j = 1, . . . , n, drawn from po(x) together 
with the same number of samples Uj — (Rij, . . . , 'Rnpj) 
drawn from pi{y) Calculate the distances to the ori- 
gin Vij — \rij\ and Rij = \Rij\ and combine all r^j 
to the set {ra,ri,,rc ■ ■ ■), as well as all Rij to the set 
{Ra, Rb, Rc, ■ ■ ■)■ Provided in both sets the elements are 
ordered ascending, ra < rj, < < . ■ ■ and Ra < Rb ^ 
Rc < • • ■ , "0* is simulated by constructing a one to one 
correspondence Va ^ Ra, fb ^ Rb, ■ ■ ■ and estimating 
^*{fa) to be Ra, a = a,b, c, . . . . In effect, we have drawn 
the Tq, and Ra from the densities 50 (^) and gi{r), respec- 
tively, and have established a one-to-one correspondence 
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between the ordered samples. We refer to this scheme as 
the simulation of the map of go to gi. 

The solid curve shown in Fig.[5]is the result of a simula- 
tion of the function ijj* for a Lennard-Jones fluid (param- 
eters of argon, p* — 0.9, T* — 1.2). The corresponding 
densities go and gi are plotted in Fig.[4l Noticeable is the 
sudden "start" of ip* with a value of roughly a. This is 
due to the strong repulsive part of the interaction, that 
keeps particles in system 1 approximately a distance a 
away from the origin. Therefore, the behavior of ip*{r) 
for r ^ is not obtainable from finite-time simulations. 
However, the definition of ip* implies that for any soft- 
core potential ip*{0) — holds. To model V'* for small r, 
the limit gi{r) ar^e'^^^'^^'in/Vc can be used, where 
a is a constant. Thus, Eq. ([70)1 can be written 



3a 



/2 - 

r e 



(72) 



in the limit r — > 0, with -0*" being the inverse of i/j*. 
The constant a is in general unknown, but here it can 
be chosen such that is fits continuously to the simulated 
part of ^/i*^^. 

When the function ip* is used in the configuration 
space map </> according to Eq. (|50p . then, by definition of 
"0*, the radial density go{r) of the mapped distribution 
Po{x), Eq. ([8]), is identical to the one of pi{x): 

^o(^) '-=^^2 [ Smk\-R)poi<P)dcl> 

p I. 



Sd'iri) ~ R)po{x)dx 
5{'ijj{r) — R)S{ri — r)po{x)dxdr 



(5(0(r) - R)goir)dr 
9i{R)- 



(73) 



Therefore we expect that the overlap of the mapped dis- 
tribution pq with pi is larger than the overlap of the 
unmapped distribution po with pi. However, it must be 
noted that the use of ?/;* in the map (j) is in general valid 
only in the limit of an infinite large system {N^ Vc oo; 
N/Vc = const), since we have not yet taken into account 
the requirement that particles may not be mapped out 
of the confining box. If -Rbox is chosen large enough, this 
might not be a serious problem, cf. Fig. [51 



2. Application of the radial map ip* 

We now apply tp* and estimate the chemical potential 
of a dense Lennard-Jones fluid {p* = 0.9, T* = 1.2, pa- 
rameters of argon) with i?box — 3.1056 a and Np — 216 
particles. Configurations are drawn from po and pi us- 
ing a Metropolis algorithm with 7 decorrelation sweeps 



40 



30 



20 



10 



-10 




100 



1000 



10000 



100000 



N 



FIG. 6: Targeted estimates of the excess chemical potential 
p^"" of a dense Lennard-Jones fluid (p* = 0.9, T* = 1.2) 
compared to traditional estimates. 



between successive drawings. From every drawn config- 
uration there results one value for the traditional work 
and one for the work related to the map. The usual 
cut-off corrections @ are applied. To avoid mapping 
particles out of the confining box, we simulate the map 
on the interval < r < Rbox subject to the condition 
V'*(^box) = -Rbox and use 'il^*{r) = r for r > i?box- The 
derivatives of -0* and -0*"^ are obtained numerically. For 
the calculation of the work values in the simulation, the 
functions 0*(r) and ip*~^{r) as well as their derivatives 
are discretized in steps Ar with i?box/Ar = 11 • 10*. 

A comparison of the behavior of the targeted and tra- 
ditional forward, reverse and two-sided estimators in de- 
pendence of the sample size N is given in Fig. [5) (for 
the two-sided estimators uq — ni — N is used). Each 
data point represents the average value of z{N) inde- 
pendent estimates p'^^{N). The error bars display one 
standard deviation. z{N) reads z{N) = 450,250,45,5 
for N = 100, 1000, 10000, 100000, respectively. 

As can be seen from Fig. [51 the traditional one-sided 
estimators behave quite different. The reverse estima- 
tor converges extremely slow in comparison to the for- 
ward estimator. This can be understood by comparing 
the average work values in forward (i=0) and reverse 
(i=l) direction, see Table HIl Since the absolute value of 



TABLE II: Estimatet values of the mean forward and reverse 
work, obtained from N — 10^ sampled work values each. 





0WO 


pW 


traditional 


lO^o 


-9.8 


targeted 


10^ 





f3AF = f3p^^ is small, the traditional reverse estimator 
practically never converges, whereas for an accurate tra- 
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FIG. 7: Members of the family of radial mapping functions 
-i/jm for the Lennard- Jones potential. For m ^ 0, V™ con- 
verges to the identity map ^o{r) = r. 
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FIG. 8: Forward (right) and reverse (left) work distributions 
of a Lennard- Jones fluid (p* — 0.9, T* — 1.2) for different 
radial mapping functions ■(/),„. m — results in the traditional 
work distributions. 



ditional forward estimate we need some 10^ work values, 
cf. Eqs. p4)) and ((35)) . In contrast, the targeted one-sided 
estimators both show a similar convergence behaviour if 
compared with each other. However, the convergence is 
slow. 

The two-sided estimators converge much faster, in 
particular, the targeted two-sided estimator converges 
fastest, see Fig. [6l The convergence of the latter was 
checked with the convergence measure a{N), Eq. pS)) . 
A moderate gain in precision for the two-sided targeted 
estimator is found if compared to the precision of the two- 
sided traditional estimator which can be quantified with 
the overlap measure ioi Namely, ioi = LS-IO"" for 
the targeted case, and Ao\ = 1.1 • 10^'* for the traditional 
case. 

We also studied other radial mapping functions ip. 
Some of them turned out to give much better results and 
are easier to deal with. 



3. Other radial mapping functions 

The radial mapping function ijj* was obtained from 
simulations, beacause the distribution gi{r) is analyti- 
cally unknown. However, we are free to use any radial 
mapping function ipir) and can thus in turn fix the func- 
tion gi appearing in Eq. (j70p . To do this, we introduce 
the normalized, positive definite function g'i{r), 

= !:!e-'5(VM+QM), re[0,Rto.]. (74) 

Cl 

Q{r) is an arbitrary finite function over {0-,Rbox] and 
Cl = jj*''""' r^e"'^^^*^''-'"'"'^*^ dr a normalization constant. 



Further, let (7o(r) be a normalized quadratic density, 

^2 

5oW = -, re[0.,Rbox], (75) 

Co 

with Co = RU3. 

The general (monotonically increasing) radial mapping 
function ip(r) can be expressed in terms of the equation 

i/i(r) r 

I g[{t)dt^ I g'o{t)dt (76) 



for r E [0, i?box]- For r > i?box it shall be understood 
that ip{r) = r. Given the function Q{r), ip and ip^'^ 
are determined uniquely by Eq. (j76p . An advantage of 
defining -0 with equation (|76p is that the derivative dip/dr 
is given in terms of V and Q, 

dr Tp{r)'^ ' 

with / = — ^ In 1^ . Using Tp in the configuration space 
map ipi^) according to Eq. (|50p yields the work function 

i<j 

- {W(^0)-/}- (78) 

n<i?box 

Here is understood to be = ipii'i)^^ ^^'^ the sum 
in the second line extends only over those particles for 
which r < f?box holds. Note that the potential-energy 
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FIG. 9: The average generalized work {W)^ and {W)^ in 
forward and reverse direction, respectively, and the two-sided 
overlap measure A^i in dependence of the mapping parameter 
m. The forward dissipation is reduced up to 18 orders of 
magnitude if compared with the traditional dissipation, cf. 
Tab. [Ill Among the one-sided estimators the best is found 
for m = and in forward direction. The optimal two-sided 
estimator results from using the m that maximizes Aoi- 



FIG. 10: Two-sided estimates of as function of the map- 
ping parameter m out of no = ni = A'^ = 10* work values in 
both directions for each m. The value of the traditional esti- 
mate (m = 0) is /i™oi = 4.0 ± 2.0. The error bars show the 
square root of the estimated mean square errors mseoi. For 
comparison, the dashed line represents a two-sided estimate 
with N = 10*^ and m = 0.0005 (standard-deviation 0.03). 



contribution of the extra particle fixed at the origin is 
eliminated in the work function, due to the definition 
of V'- However, in Eq. ([78|) we have already assumed 
V{r) to be cut of at r = i?box, i-e. V{r) = for r > 
-Rbox- Otherwise we had to add X^r >_r — 

4- A family of maps 

We now introduce a family {ijjm} of radial mapping 
functions, where each member ^p^n is defined by Eq. (j76p 
with the choice 

Q{r) = {m-l)V{r) (79) 

in the expression (|74p . Useful maps are obtained for 
m € [0,1]. Fig. [7] depicts some members of the fam- 
ily {'(pm} for Lennard- Jones interaction (with parameters 
of argon). Again, we apply these functions discretized 
(in steps Ar with i?box/Ar = 11 • 10"*) to the calcula- 
tion of the targeted forward and reverse work AH{x) 
and AH{(t)-'^{x)). Any pair of forward and reverse tar- 
geted work distributions belonging to the same value of 
m obeys the fluctuation theorem p^ . In particular they 
cross at W ~ {AF = here). Nevertheless, the 
shape of these distributions is sensitive to the value of 
m. This is demonstrated in Fig. [51 There, normalized 
histograms of f3W are shown. They result from 10^ work 
values for per m and per direction. We emphasize that 



all of the targeted forward (reverse) work values were ob- 
tained with one sample of = 10^ configurations x from 
Po (Pi)- 

Instructive is the comparison of the mean work (W) 
related to different values of m. In Fig. [5] estimated val- 
ues of mean work are shown in dependence of m. From 
these values one sees that the dissipation is minimal for 
TO = in the reverse direction. Therefore, the best one- 
sided targeted estimate of ^™ among the family {4>m} 
is obtained with to = in forward direction, i.e. with 
the traditional particle insertion. However, the same is 
not true for two-sided estimates. Using the same data as 
before and performing two-sided estimates with A'^ — 10^ 
work values in each direction, we obtain the displayed 
values /i^'^oi ^iS- HOI In order to compare the per- 
formance of two-sided estimators for different maps, we 
estimate the overlap measures ^oi ■ The latter are shown 
in Fig. [21 The maximum value for Aoi is found with m 
being 0.0005. This indicates that to « 0.0005 is the opti- 
mal choice for m. The estimates A^i are used to calculate 
the mean square errors mseoi of the estimates h'^^qi ■ The 
square roots of the mseoi enter in Fig. [TOlas error bars. 

We are left to check the convergence properties of two- 
sided estimators. Fig. [Tl] displays the convergence mea- 
sure a{N) for some parameter values m. Best conver- 
gence is found for m = 0.0005 (not shown in Fig. [11] but 
very similar to to = 0.001). The same value of the map- 
ping parameter to was found to maximize the overlap 
Aoi- 

Employing the optimal value 0.0005 for the mapping- 
parameter and using A^ = 10^ forward and reverse sam- 
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N 

FIG. 11: Convergence measure a(N) of two-sided estimates 
for some parameter values m, depending on the sample size 
A'^. A faster decrease of a towards the value indicates a 
faster convergence of the two-sided estimator. 

pies, we have computed the chemical potential. The re- 
sults are given in Table UlTl The listed error is the square 
root of the msegi according to Eq. with Ad = Ao\. 
This is justified with the observed values of the conver- 
gence measure a which are listed in the table, too. 

TABLE III: Two-sided estimates of the excess chemical 

potential of a Lennard- Jones fluid (p* = 0.9, T* = 1.2). Also 
listed is the two-sided overlap measure Aoi and the conver- 
gence measure a. For the targeted estimate the radial map- 
ping function tprn with m — 0.0005 is used. The number of 
work values in each direction is TV = 10*' and the number of 
particles in the simulation Np — 216. 







lO^'iol 


a 


traditional 


1.88 ±0.08 


1.2 


0.05 


targeted 


1.91 ±0.03 


9.5 


-0.02 



It should be mentioned that the optimal value of m 
found here is not universal, but depends on the density 
p* . If another value is chosen for p*, the optimal m can 
again be found from numerical simulations. Note that 
the maps used here can be applied to simulations where 
particles are inserted and deleted at random too. 
One simply has to use the point of insertion (deletion) 
as temporary origin of the coordinate system and apply 
the map there. This might enhance the efficiency of the 
simulation. 



VIII. CONCLUSION 

The central result of this paper, a fluctuation theorem 
for generalized work distributions, allowed us to estab- 



lish an optimal targeted two-sided estimator of the free 
energy difference AF. We have numerically tested this 
estimator and found it to be superior with respect to one- 
sided and non-targeted estimators. In addition we have 
demonstrated that this estimator can be applied success- 
fully to estimate the chemical potential of a Lennard- 
Jones fluid in the high density regime. 

In order to use the targeted two-sided estimator it is 
however crucial to use a suitable map. We have investi- 
gated the construction of maps and developed appropri- 
ate measures which enabled a quantitative comparison 
of the performance of different maps. Especially, a mea- 
sure for the convergence of the two-sided estimate was 
designed. 

This paths the way for better results when free energy 
differences or chemical potentials need to be estimated 
numerically. 
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APPENDIX: CONSTRAINT MAXIMUM 
LIKELIHOOD DERIVATION OF THE 
TWO-SIDED ESTIMATOR 

Deriving the optimal estimator of A_F, given a col- 
lection of no forward {W^°} and rii reverse {W'/} work- 
values drawn from p(VF|0) and p(VF|l), respectively, 
leads to Bennett's acceptance ratio method [§] with the 
target map included. 

In Section llVl the mixed ensemble is introduced, where 
the elements are given by pairs of values (W, Y) of work 
and direction, and which is specified by the probabili- 
ties of direction py and the densities p(W^|y). With the 
mixture ensemble, the mixing ratio — can be chosen ar- 
bitrarily. Crucial about the mixture ensemble is that, 
according to the fluctuation theorem p^ . the analytic 
form of the conditional probabilities p(F|VF) can be de- 
rived explicitly, regardless of whether p(VF|y) is known, 
see Sec. IIVI This provides a natural way to construct a 
constraint maximum likelihood estimator [H, |2j [25| for 
AF. 

Since it is only possible to draw from the ensembles 
piW\Y), but not from piY\W), F = 0, 1, the proper 
log-likelihood is 

ln£-^lnp(W^O|0)+^lnp(I^/|l). (A.l) 

i=i j=i 

A direct maximization of (jA.ip with respect to AF is 
impossible without knowledge of the analytic form of 
the probability densities p(W^|F). However, according 
to Bayes theorem ([M]) the log-likelihood can be split into 

In £ = In £post {AF) + In £prior + In Cp^ ( A.2) 
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with which can be solved in p(W) , 

ln£po,t(AF) = ^lnp(0|MA") + ^lnp(l|M^i), (A.3) Efc '^(^ " ^fe) m 

^=l = 1 ' Xp{l\W)+fl ' ^ ' ' 



no+ni 
In /Iprior — ^ ^ 
fc=l 



Inp(M^fc), 



and 



ln£i, 



no In ■ 



rii In ■ 



Pi 



(A.4) 



(A.5) 



where the sum in the prior hkehhood (jA.4[) runs over aU 
n observed forward and reverse work values. 

Since the definite form of p{W) is unknown, we treat it 
in the manner of an unstructured prior distribution and 
maximize ()A.2| with respect to the constant AF and to 
the function [2^. Thereby, 

1 = J p{W)dW (A.6) 



and 



Pi 



p{l\W)p{W)dW 



(A.7) 



enter as constraints. Using Lagrange parameters A and 
/i, the constrained log-likelihood reads 



InC = hiC + X{pi - I p{l\W)p{W)dW) 

+ [ piW)dW). (A.8) 



The conditional direction probabilities are 
known explicitly in dependence of AF, Eq. ((28)) . 



and their partial derivatives read -g lnp{0\W) — 
-p(l\W) and l^\np{l\W) ^p{0\W) = l-p{l\W). 
This allows to extremize the constraint log-likelihood 
(jA.Sp with respect to AF, 



= - 



1 d 



no -I- Til 



PdAF 



k=l 



A / {1 - p{l\W))p{l\W)p{W)dW. (A.9) 



Extremizing the conditional likelihood (|A.8p with respect 
to the function p{W) gives 







6p{W) 



Inr 



1 " 

-^^5(M^-T4^,)-AMir)-, 



(A.lO) 



fc=i 



or written as 



Xp{l\W)p{W) = -MW) + S{W ~ Wk). (A.12) 



If interested in the values of the Lagrange multipliers 
A and /z, one multiplies Eq. (jA.lOp with p(W) and inte- 
grates. This yields 



= n — Api — /i. 



(A.13) 



A second independent equation follows from inserting 
Eq. (|A.12p into Eq. (IA.9P which results in 

= ni + /i — /ipi — 71, (A. 14) 

and the Lagrange multipliers take the values 



no npo- no 
— , and A — . 



Po 



PoPi 



(A.15) 



With the distribution (jA.lip the constraints (|A.6p and 
(|A.7p read 



and 



Pi 



E 



piMWk) 



1 



\pil\Wk) + ^i 



(A.16) 



Pi 



Xpil\Wk)+l^ ni 



^pB(llW^fe), (A.17) 



where p^{l\W) denotes p{l\W) with C = AF + ^\n^. 
Whenever the constraint (jA.lTp is fulfilled, the constraint 
(|A.16p and the variational equations (|A.9p and (jA.lOp 
are automatically satisfied. In consequence, Eq. (|A.17P 
defines the constrained maximum likelihood estimate of 
AF. Note that the estimator (jA.17p is independent of 
the choice of Moreover, Eq. (|A.17p is equivalent to 
Eq. ((29)) regardless of the choice of ^i. . 

An alternative derivation of the estimator (|A.17p was 
presented by Shirts et al. [ll|. There, the specific choice 
2i = ^ was necessary. With this choice, the Lagrange 
parameter A is identical to zero. Hence, there is no need 
to take any constraint into consideration and the poste- 
rior log-likelihood (IA.3P results directly in the estimator 
of AF. 
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